Selective eye-gaze augmentation to enhance imitation learning in Atari games
Authors
Abstract
This paper presents the selective use of eye-gaze information in learning human actions in Atari games. Extensive evidence suggests that our eye movements convey a wealth of information about the direction of attention and mental states, and encode the information necessary to complete a task. Based on this evidence, we hypothesize that eye-gaze, as a clue for the direction of attention, will enhance learning from demonstration. For this purpose, we propose a selective eye-gaze augmentation (SEA) network that learns when to use gaze information. The proposed architecture consists of three sub-networks: a gaze prediction network, a gating network, and an action prediction network. Using the prior 4 game frames, a gaze map is predicted by the gaze prediction network and used for augmenting the input frame. The gating network determines whether the predicted gaze map should be fed to the action prediction network to predict the action at the current frame. To validate our approach, we use the publicly available Atari Human Eye-Tracking And Demonstration (Atari-HEAD) dataset of 20 Atari games with 28 million human demonstrations and 328 million eye-gazes (over game frames) collected from four subjects. We demonstrate the efficacy of our approach compared to the state-of-the-art Attention Guided Imitation Learning (AGIL) and Behavior Cloning (BC). The results indicate that our approach (the SEA network) performs significantly better than AGIL and BC. Moreover, we demonstrate its significance through a comparison with random selection of gaze. Even in this case, the SEA network performs better, validating the advantage of selectively using gaze in demonstration learning.
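The selective-augmentation idea in the abstract can be sketched in a few lines. Everything below (the function names, the multiplicative fusion of frame and gaze map, and the 0.5 gate threshold) is an illustrative assumption for exposition, not the paper's actual implementation:

```python
import numpy as np

def augment_with_gaze(frame, gaze_map):
    """Weight the frame by the predicted gaze map -- one plausible way to
    'augment' the input; the paper's exact fusion operation may differ."""
    return frame * gaze_map

def sea_forward(frame, gaze_map, gate_prob, threshold=0.5):
    """Selective augmentation: the gating decision chooses between the
    gaze-augmented frame and the raw frame as input to action prediction."""
    if gate_prob >= threshold:
        return augment_with_gaze(frame, gaze_map)
    return frame

# Toy example: a 4x4 frame and a gaze map that highlights the top-left pixel.
frame = np.ones((4, 4))
gaze_map = np.zeros((4, 4))
gaze_map[0, 0] = 1.0

augmented = sea_forward(frame, gaze_map, gate_prob=0.9)    # gate open
passthrough = sea_forward(frame, gaze_map, gate_prob=0.1)  # gate closed
```

When the gate is open, regions outside the predicted gaze are suppressed; when it is closed, the raw frame passes through unchanged, which is the "selective" part the paper contrasts with always-on gaze augmentation.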
Similar resources
Learning to Play Atari Games
Teaching computers to play video games is a complex learning problem that has recently seen increased attention. In this paper, we develop a system that, using constant model and hyperparameter settings, learns to play a variety of Atari games. In order to accomplish this task, we extract object features from the game screen, and provide these features as input into reinforcement learning algor...
Atari Games and Intel Processors
The asynchronous nature of state-of-the-art reinforcement learning algorithms such as the Asynchronous Advantage Actor-Critic algorithm makes them exceptionally suitable for CPU computations. However, given the fact that deep reinforcement learning often deals with interpreting visual information, a large part of the training and inference time is spent performing convolutions. In this work we...
The Impact of Determinism on Learning Atari 2600 Games
Pseudo-random number generation on the Atari 2600 was commonly accomplished using a Linear Feedback Shift Register (LFSR). One drawback was that the initial seed for the LFSR had to be hard-coded into the ROM. To overcome this constraint, programmers sampled from the LFSR once per frame, including title and end screens. Since a human player will have some random amount of delay between seeing t...
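The determinism issue described above can be illustrated with a minimal Fibonacci LFSR; the register width and tap positions below are arbitrary choices for illustration, not the Atari 2600's actual polynomial:

```python
def lfsr_sequence(seed, length, taps=(0, 2, 3, 5), nbits=8):
    """Fibonacci LFSR: each step, the feedback bit is the XOR of the tapped
    bits and is shifted into the high end of the register."""
    state = seed
    out = []
    for _ in range(length):
        out.append(state)
        feedback = 0
        for t in taps:
            feedback ^= (state >> t) & 1
        state = ((state >> 1) | (feedback << (nbits - 1))) & ((1 << nbits) - 1)
    return out

# With a hard-coded seed, every run (i.e., every game boot) produces the same
# "random" sequence -- which is why programmers resampled once per frame to
# mix in the human player's variable reaction delay.
run_a = lfsr_sequence(0xA5, 10)
run_b = lfsr_sequence(0xA5, 10)
```

Identical seeds always reproduce the identical sequence, so without per-frame sampling the game's "randomness" would be fully determined at power-on.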
Distributed Deep Reinforcement Learning: Learn how to play Atari games in 21 minutes
We present a study in Distributed Deep Reinforcement Learning (DDRL) focused on scalability of a state-of-the-art Deep Reinforcement Learning algorithm known as Batch Asynchronous Advantage Actor-Critic (BA3C). We show that using the Adam optimization algorithm with a batch size of up to 2048 is a viable choice for carrying out large scale machine learning computations. This, combined with caref...
Deep Learning for Reward Design to Improve Monte Carlo Tree Search in ATARI Games
Monte Carlo Tree Search (MCTS) methods have proven powerful in planning for sequential decision-making problems such as Go and video games, but their performance can be poor when the planning depth and sampling trajectories are limited or when the rewards are sparse. We present an adaptation of PGRD (policy-gradient for reward design) for learning a reward-bonus function to improve UCT (a MCTS a...
Journal
Journal title: Neural Computing and Applications
Year: 2021
ISSN: 0941-0643, 1433-3058
DOI: https://doi.org/10.1007/s00521-021-06367-y